182 research outputs found

    Nanowire Volatile RAM as an Alternative to SRAM

    Full text link
    Maintaining benefits of CMOS technology scaling is becoming challenging due to increased manufacturing complexities and unwanted passive power dissipations. This is particularly challenging in SRAM, where manufacturing precision and leakage power control are critical issues. To alleviate some of these challenges a novel non-volatile memory alternative to SRAM was proposed called nanowire volatile RAM (NWRAM). Due to NWRAMs regular grid based layout and innovative circuit style, manufacturing complexity is reduced and at the same time considerable benefits are attained in terms of performance and leakage power reduction. In this paper, we elaborate more on NWRAM circuit aspects and manufacturability, and quantify benefits at 16nm technology node through simulation against state-of-the-art 6T-SRAM and gridded 8T-SRAM designs. Our results show the 10T-NWRAM to be 2x faster and 35x better in terms of leakage when compared to high performance gridded 8T-SRAM design

    LoGPC: Modeling Network Contention in Message-Passing Programs

    Get PDF
    In many real applications, for example those with frequent and irregular communication patterns or those using large messages, network contention and contention for message processing resources can be a significant part of the total execution time. This paper presents a new cost model, called LoGPC, that extends the LogP [9] and LogGP [4] models to account for the impact of network contention and network interface DMA behavior on the performance of message-passing programs. We validate LoGPC by analyzing three applications implemented with Active Messages [11, 18] on the MIT Alewife multiprocessor. Our analysis shows that network contention accounts for up to 50% of the total execution time. In addition, we show that the impact of communication locality on the communication costs is at most a factor of two on Alewife. Finally, we use the model to identify tradeoffs between synchronous and asynchronous message passing styles. 1 Introduction Users of parallel machines need good performa..

    Skybridge: 3-D Integrated Circuit Technology Alternative to CMOS

    Full text link
    Continuous scaling of CMOS has been the major catalyst in miniaturization of integrated circuits (ICs) and crucial for global socio-economic progress. However, scaling to sub-20nm technologies is proving to be challenging as MOSFETs are reaching their fundamental limits and interconnection bottleneck is dominating IC operational power and performance. Migrating to 3-D, as a way to advance scaling, has eluded us due to inherent customization and manufacturing requirements in CMOS that are incompatible with 3-D organization. Partial attempts with die-die and layer-layer stacking have their own limitations. We propose a 3-D IC fabric technology, Skybridge[TM], which offers paradigm shift in technology scaling as well as design. We co-architect Skybridge's core aspects, from device to circuit style, connectivity, thermal management, and manufacturing pathway in a 3-D fabric-centric manner, building on a uniform 3-D template. Our extensive bottom-up simulations, accounting for detailed material system structures, manufacturing process, device, and circuit parasitics, carried through for several designs including a designed microprocessor, reveal a 30-60x density, 3.5x performance per watt benefits, and 10X reduction in interconnect lengths vs. scaled 16-nm CMOS. Fabric-level heat extraction features are shown to successfully manage IC thermal profiles in 3-D. Skybridge can provide continuous scaling of integrated circuits beyond CMOS in the 21st century.Comment: 53 Page

    Realization of Four-Terminal Switching Lattices: Technology Development and Circuit Modeling

    Get PDF
    Our European Union’s Horizon-2020 project aims to develop a complete synthesis and performance optimization methodology for switching nano-crossbar arrays that leads to the design and construction of an emerging nanocomputer. Within the project, we investigate different computing models based on either two-terminal switches, realized with field effect transistors, resistive and diode devices, or four-terminal switches. Although a four-terminal switch based model offers a significant area advantage, its realization at the technology level needs further justifications and raises a number of questions about its feasibility. In this study, we answer these questions. First, by using three dimensional technology computer-aided design (TCAD) simulations, we show that four-terminal switches can be directly implemented with the CMOS technology. For this purpose, we try different semiconductor gate materials in different formations of geometric shapes. Then, by fitting the TCAD simulation data to the standard CMOS current-voltage equations, we develop a Spice model of a four-terminal switch. Finally, we successfully perform Spice circuit simulations on four-terminal switches with different sizes. As a follow-up work within the project, we will proceed to the fabrication step.This work is part of a project that has received funding from the European Union’s H2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 691178, and supported by the TUBITAK-Career project #113E760

    Integrated Synthesis Methodology for Crossbar Arrays

    Get PDF
    Nano-crossbar arrays have emerged as area and power efficient structures with an aim of achieving high performance computing beyond the limits of current CMOS. Due to the stochastic nature of nano-fabrication, nano arrays show different properties both in structural and physical device levels compared to conventional technologies. Mentioned factors introduce random characteristics that need to be carefully considered by synthesis process. For instance, a competent synthesis methodology must consider basic technology preference for switching elements, defect or fault rates of the given nano switching array and the variation values as well as their effects on performance metrics including power, delay, and area. Presented synthesis methodology in this study comprehensively covers the all specified factors and provides optimization algorithms for each step of the process.This work is part of a project that has received funding from the European Union’s H2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 691178, and supported by the TUBITAK-Career project #113E76
    corecore